Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning in light of such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.

Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages.
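To make the prediction task concrete, a minimal sketch of the classic constant-velocity baseline that this literature commonly compares against (the function name and setup here are illustrative, not from the survey):

```python
import numpy as np

def constant_velocity_predict(track, horizon):
    """Predict future 2D positions by extrapolating the last observed velocity.

    track: array-like of shape (T, 2) with observed positions.
    horizon: number of future steps to predict.
    Returns an array of shape (horizon, 2).
    """
    track = np.asarray(track, dtype=float)
    velocity = track[-1] - track[-2]           # last-step displacement
    steps = np.arange(1, horizon + 1)[:, None]
    return track[-1] + steps * velocity

# A pedestrian walking diagonally at constant speed:
obs = [[0, 0], [1, 1], [2, 2]]
print(constant_velocity_predict(obs, 3))       # extrapolates to (3,3), (4,4), (5,5)
```

Despite its simplicity, this baseline is a standard sanity check: learned predictors are expected to beat it on curved or interactive trajectories.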
Metric-Scale Truncation-Robust Heatmaps for 3D Human Pose Estimation
Heatmap representations have formed the basis of 2D human pose estimation
systems for many years, but their generalizations for 3D pose have only
recently been considered. This includes 2.5D volumetric heatmaps, whose X and Y
axes correspond to image space and the Z axis to metric depth around the
subject. To obtain metric-scale predictions, these methods must include a
separate, explicit post-processing step to resolve scale ambiguity. Further,
they cannot encode body joint positions outside of the image boundaries,
leading to incomplete pose estimates in case of image truncation. We address
these limitations by proposing metric-scale truncation-robust (MeTRo)
volumetric heatmaps, whose dimensions are defined in metric 3D space near the
subject, instead of being aligned with image space. We train a
fully-convolutional network to estimate such heatmaps from monocular RGB in an
end-to-end manner. This reinterpretation of the heatmap dimensions allows us to
estimate complete metric-scale poses without test-time knowledge of the focal
length or person distance and without relying on anthropometric heuristics in
post-processing. Furthermore, as the image space is decoupled from the heatmap
space, the network can learn to reason about joints beyond the image boundary.
Using ResNet-50 without any additional learned layers, we obtain
state-of-the-art results on the Human3.6M and MPI-INF-3DHP benchmarks. As our
method is simple and fast, it can become a useful component for real-time
top-down multi-person pose estimation systems. We make our code publicly
available to facilitate further research (see
https://vision.rwth-aachen.de/metro-pose3d).

Comment: Accepted for publication at the 2020 IEEE Conference on Automatic Face and Gesture Recognition (FG 2020).
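To illustrate how a metric-space volumetric heatmap can be decoded into coordinates, here is a hedged sketch of a soft-argmax readout over a cube defined in meters around the subject. The cube side length and function name are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def soft_argmax_3d(heatmap, cube_side_m=2.2):
    """Decode a volumetric heatmap into a metric 3D joint position.

    heatmap: array of shape (D, H, W) of unnormalized scores; the cube
    spans `cube_side_m` meters along each axis, centered on the subject.
    Returns the expected (x, y, z) position in meters.
    """
    probs = np.exp(heatmap - heatmap.max())    # softmax over all voxels
    probs /= probs.sum()
    D, H, W = heatmap.shape
    # Voxel-center coordinates in meters, from -side/2 to +side/2.
    zs = (np.arange(D) + 0.5) / D * cube_side_m - cube_side_m / 2
    ys = (np.arange(H) + 0.5) / H * cube_side_m - cube_side_m / 2
    xs = (np.arange(W) + 0.5) / W * cube_side_m - cube_side_m / 2
    x = (probs.sum(axis=(0, 1)) * xs).sum()    # marginalize, then expectation
    y = (probs.sum(axis=(0, 2)) * ys).sum()
    z = (probs.sum(axis=(1, 2)) * zs).sum()
    return np.array([x, y, z])

# A unimodal heatmap peaked at one voxel decodes near that voxel's center:
hm = np.zeros((8, 8, 8))
hm[2, 4, 6] = 10.0
print(soft_argmax_3d(hm))
```

Because every axis is metric, the decoded position needs no focal length or distance at test time, which is the key property the abstract describes.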
MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation
Heatmap representations have formed the basis of human pose estimation
systems for many years, and their extension to 3D has been a fruitful line of
recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes
correspond to image space and Z to metric depth around the subject. To obtain
metric-scale predictions, 2.5D methods need a separate post-processing step to
resolve scale ambiguity. Further, they cannot localize body joints outside the
image boundaries, leading to incomplete estimates for truncated images. To
address these limitations, we propose metric-scale truncation-robust (MeTRo)
volumetric heatmaps, whose dimensions are all defined in metric 3D space,
instead of being aligned with image space. This reinterpretation of heatmap
dimensions allows us to directly estimate complete, metric-scale poses without
test-time knowledge of distance or relying on anthropometric heuristics, such
as bone lengths. To further demonstrate the utility of our representation, we
present a differentiable combination of our 3D metric-scale heatmaps with 2D
image-space ones to estimate absolute 3D pose (our MeTRAbs architecture). We
find that supervision via absolute pose loss is crucial for accurate
non-root-relative localization. Using a ResNet-50 backbone without further
learned layers, we obtain state-of-the-art results on Human3.6M, MPI-INF-3DHP
and MuPoTS-3D. Our code will be made publicly available to facilitate further
research.

Comment: See project page at https://vision.rwth-aachen.de/metrabs . Accepted for publication in the IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), Special Issue "Selected Best Works From Automated Face and Gesture Recognition 2020". Extended version of FG paper arXiv:2003.0295
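The idea of combining metric-scale 3D estimates with 2D image-space evidence to obtain absolute pose can be illustrated with a generic least-squares alignment. This is a simplified stand-in, not the paper's differentiable architecture: given a root-relative 3D pose and its 2D projections in normalized camera coordinates, the camera-space translation follows from a linear system:

```python
import numpy as np

def recover_translation(pose3d, points2d_norm):
    """Least-squares absolute translation from a root-relative 3D pose
    and its 2D projections (normalized camera coordinates, i.e. pixel
    coordinates with the intrinsics removed).

    From the pinhole model u = (X+tx)/(Z+tz), v = (Y+ty)/(Z+tz),
    each joint gives two linear equations in t = (tx, ty, tz).
    """
    X, Y, Z = pose3d[:, 0], pose3d[:, 1], pose3d[:, 2]
    u, v = points2d_norm[:, 0], points2d_norm[:, 1]
    n = len(pose3d)
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0], A[0::2, 2], b[0::2] = 1.0, -u, u * Z - X   # tx - u*tz = uZ - X
    A[1::2, 1], A[1::2, 2], b[1::2] = 1.0, -v, v * Z - Y   # ty - v*tz = vZ - Y
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Synthetic check: project a pose under a known translation, then recover it.
rng = np.random.default_rng(0)
pose = rng.normal(scale=0.5, size=(17, 3))     # root-relative joints (meters)
t_true = np.array([0.3, -0.1, 4.0])            # camera-space translation
cam = pose + t_true
proj = cam[:, :2] / cam[:, 2:3]                # pinhole projection
print(recover_translation(pose, proj))         # close to t_true
```

The abstract's finding that an absolute pose loss is crucial suggests that learning this combination end to end outperforms such a fixed post-hoc alignment.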
Plug-and-Play SLAM: A Unified SLAM Architecture for Modularity and Ease of Use
Nowadays, SLAM (Simultaneous Localization and Mapping) is considered by the
Robotics community to be a mature field. Currently, there are many open-source
systems that are able to deliver fast and accurate estimation in typical
real-world scenarios. Still, these systems often provide ad-hoc
implementations tied to predefined sensor configurations. In this work,
we tackle this issue, proposing a novel SLAM architecture specifically designed
to address heterogeneous sensor configurations and to standardize SLAM
solutions. Thanks to its modularity and to specific design patterns, the
presented architecture is easy to extend, enhancing code reuse and efficiency.
Finally, using our solution, we conducted comparative experiments across a
variety of sensor configurations, obtaining competitive results that confirm
state-of-the-art performance.
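The modular, sensor-agnostic design the abstract describes can be sketched with a common interface that each sensor module implements, so the rest of the pipeline never depends on the sensor type. All class and method names here are illustrative assumptions, not taken from the actual system:

```python
from abc import ABC, abstractmethod

class SensorModule(ABC):
    """Common interface so the pipeline stays agnostic to the sensor type."""

    @abstractmethod
    def preprocess(self, raw):
        """Convert raw measurements into a common observation format."""

class LidarModule(SensorModule):
    def preprocess(self, raw):
        return {"type": "scan", "points": raw}

class CameraModule(SensorModule):
    def preprocess(self, raw):
        return {"type": "image", "pixels": raw}

class SlamPipeline:
    """Plugs arbitrary sensor modules into a single tracking/mapping loop."""

    def __init__(self):
        self.modules = {}

    def register(self, name, module):
        self.modules[name] = module

    def process(self, name, raw):
        obs = self.modules[name].preprocess(raw)
        # ...tracking and map update would consume `obs` here...
        return obs

pipeline = SlamPipeline()
pipeline.register("lidar", LidarModule())
pipeline.register("camera", CameraModule())
print(pipeline.process("lidar", [[0.0, 1.0]])["type"])  # → scan
```

Adding support for a new sensor then means writing one new module class and registering it, without touching the tracking or mapping code, which is the code-reuse benefit the abstract claims.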
- …